Towards Dynamic Word Sense Discrimination with Random Indexing

Authors

  • Hans Moen
  • Erwin Marsi
  • Björn Gambäck
Abstract

Most distributional models of word similarity represent a word type by a single vector of contextual features, even though words commonly have more than one sense. Multiple senses can be captured by employing several vectors per word in a multi-prototype distributional model; these prototypes can be obtained by first constructing all the context vectors for the word and then clustering similar vectors to create sense vectors. Storing and clustering context vectors can be expensive, though. As an alternative, we introduce Multi-Sense Random Indexing, which performs on-the-fly (incremental) clustering. To evaluate the method, a number of measures for word similarity are proposed, both contextual and non-contextual, including new measures based on optimal alignment of word senses. Experimental results on the task of predicting semantic textual similarity do not, however, show a systematic difference between single-prototype and multi-prototype models.
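
The abstract does not say how the incremental clustering decides between extending an existing sense and opening a new one. The Python sketch below shows one plausible reading, not the authors' algorithm: each context is summed from sparse random index vectors and is either merged into the most similar existing sense vector or, if no sense is similar enough, stored as a new sense. The names `MultiSenseModel`, `index_vector`, and `context_vector`, the cosine-threshold rule, and the constants `DIM`, `NONZERO`, and `THRESHOLD` are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict

DIM = 1000       # vector dimensionality (assumed value)
NONZERO = 8      # number of +1/-1 entries per index vector (assumed value)
THRESHOLD = 0.3  # minimum cosine similarity to join an existing sense (assumed value)


def index_vector(word, dim=DIM, nonzero=NONZERO):
    """Return a sparse ternary random index vector, reproducible per word."""
    rng = random.Random(word)  # seeding with the word keeps the vector stable
    vec = [0.0] * dim
    for pos in rng.sample(range(dim), nonzero):
        vec[pos] = rng.choice((-1.0, 1.0))
    return vec


def cosine(u, v):
    """Cosine similarity of two dense vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def context_vector(context_words):
    """Sum the index vectors of the words surrounding a target occurrence."""
    vec = [0.0] * DIM
    for w in context_words:
        for i, x in enumerate(index_vector(w)):
            vec[i] += x
    return vec


class MultiSenseModel:
    """Hypothetical multi-prototype store: a list of sense vectors per word."""

    def __init__(self, threshold=THRESHOLD):
        self.threshold = threshold
        self.senses = defaultdict(list)

    def update(self, word, context_words):
        """Assign one occurrence to a sense on the fly; return the sense id."""
        ctx = context_vector(context_words)
        senses = self.senses[word]
        best, best_sim = None, -1.0
        for k, sense in enumerate(senses):
            sim = cosine(ctx, sense)
            if sim > best_sim:
                best, best_sim = k, sim
        if best is None or best_sim < self.threshold:
            senses.append(ctx)  # no close sense: open a new one
            return len(senses) - 1
        # Close enough: fold this context into the existing sense vector.
        senses[best] = [a + b for a, b in zip(senses[best], ctx)]
        return best


model = MultiSenseModel()
for ctx in (["river", "water", "shore"],
            ["money", "loan", "account"],
            ["river", "water", "stream"]):
    print(ctx, "-> sense", model.update("bank", ctx))
```

Because each occurrence is folded into a sense vector as soon as it is seen, no individual context vector has to be stored, which is the cost the abstract attributes to the cluster-afterwards approach. The threshold controls how readily new senses are created, and a fixed value is only one of several possible criteria.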

Similar resources

Word Sense Discrimination Using Context Vector Similarity

This paper presents the application of context vector similarity for the purpose of word sense discrimination during query translation. The random indexing vector space method is used to accumulate the context vectors. Pairwise similarity of the context vectors of ambiguous terms with that of anchor terms indicated the possible correct translation of a query term. Two retrieval experiments wer...

Word Sense Disambiguation Using Random Indexing

This paper presents the results of an experiment to apply a novel semantic representational formalism called Random Indexing for the supervised word sense disambiguation of English words. Random Indexing uses high-dimensional sparse vectors with random patterns modeling neural activation patterns in the brain to represent linguistic information. The presented learning and disambiguating method ...

An Introduction to Random Indexing

Word space models enjoy considerable attention in current research on semantic indexing. Most notably, Latent Semantic Analysis/Indexing (LSA/LSI; Deerwester et al., 1990, Landauer & Dumais, 1997) has become a household name in information access research, and deservedly so; LSA has proven its mettle in numerous applications, and has more or less spawned an entire research field since its intro...

Comparing Word Sense Disambiguation and Distributional Models for Cross-Language Information Filtering

In this paper we deal with the problem of providing users with cross-language recommendations by comparing two different content-based techniques: the first one relies on a knowledge-based word sense disambiguation algorithm that uses MultiWordNet as sense inventory, while the latter is based on the so-called distributional hypothesis and exploits a dimensionality reduction technique called Rand...

Discovering Word Senses from Text Using Random Indexing

Random Indexing is a novel technique for dimensionality reduction while creating a Word Space model from a given text. This paper explores the possible application of Random Indexing in discovering word senses from the text. The words appearing in the text are plotted onto a multi-dimensional Word Space using Random Indexing. The geometric distance between words is used as an indicator of their ...

Publication date: 2013